Record search metrics on cancelation by rdettai · Pull Request #5743 · quickwit-oss/quickwit

rdettai · 2025-04-10T14:42:25Z

Description

This PR moves [root|leaf]_search_requests_total, [root|leaf]_search_request_duration_seconds, and [root|leaf]_search_targeted_splits into tasks so that they are recorded even when the request future is cancelled. This gives a more accurate view of queries that time out.

How was this PR tested?

Describe how you tested this PR.

guilload · 2025-04-10T15:04:19Z

Writing a custom future would be much cleaner:
https://github.com/quickwit-oss/quickwit/blob/main/quickwit/quickwit-common/src/tower/metrics.rs
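For readers unfamiliar with the pattern, here is a minimal sketch of a metrics-tracking future, using only the standard library. It is not the code from this PR or from the linked metrics.rs: `Counters`, `TrackedFuture`, `poll_once`, and `tracked_future_demo` are hypothetical names, and plain atomic counters stand in for the real metrics. The idea is that the wrapper remembers the outcome when the inner future completes and records it in `Drop`, so a future that is dropped before completing is still counted, as cancelled.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a metrics registry: one counter per status.
#[derive(Default)]
pub struct Counters {
    pub success: AtomicUsize,
    pub error: AtomicUsize,
    pub cancelled: AtomicUsize,
}

/// Wraps a future and records exactly one status when it is dropped.
pub struct TrackedFuture<T, E> {
    // Boxing keeps the wrapper `Unpin`, which avoids unsafe pin projection.
    inner: Pin<Box<dyn Future<Output = Result<T, E>>>>,
    counters: Arc<Counters>,
    // `None` until the inner future completes; still `None` on drop means cancelled.
    is_success: Option<bool>,
}

impl<T, E> TrackedFuture<T, E> {
    pub fn new(fut: impl Future<Output = Result<T, E>> + 'static, counters: Arc<Counters>) -> Self {
        Self { inner: Box::pin(fut), counters, is_success: None }
    }
}

impl<T, E> Future for TrackedFuture<T, E> {
    type Output = Result<T, E>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let poll = self.inner.as_mut().poll(cx);
        if let Poll::Ready(result) = &poll {
            // Remember the outcome; the counter itself is bumped in `drop`.
            self.is_success = Some(result.is_ok());
        }
        poll
    }
}

impl<T, E> Drop for TrackedFuture<T, E> {
    fn drop(&mut self) {
        let counter = match self.is_success {
            Some(true) => &self.counters.success,
            Some(false) => &self.counters.error,
            None => &self.counters.cancelled,
        };
        counter.fetch_add(1, Ordering::Relaxed);
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once with a no-op waker (enough for this executor-free demo).
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

/// Completes one request, fails one, and drops one mid-flight.
/// Returns the (success, error, cancelled) counts.
pub fn tracked_future_demo() -> (usize, usize, usize) {
    let counters = Arc::new(Counters::default());

    let mut ok = TrackedFuture::new(async { Ok::<u32, String>(1) }, counters.clone());
    assert!(poll_once(&mut ok).is_ready());
    drop(ok);

    let mut failed = TrackedFuture::new(async { Err::<u32, String>("boom".into()) }, counters.clone());
    assert!(poll_once(&mut failed).is_ready());
    drop(failed);

    // A future that never completes, dropped early: simulates a timeout or a client disconnect.
    let mut pending = TrackedFuture::new(std::future::pending::<Result<u32, String>>(), counters.clone());
    assert!(poll_once(&mut pending).is_pending());
    drop(pending);

    (
        counters.success.load(Ordering::Relaxed),
        counters.error.load(Ordering::Relaxed),
        counters.cancelled.load(Ordering::Relaxed),
    )
}
```

Recording in `Drop` rather than in `poll` means there is a single place where the metric is emitted, whichever way the future ends. A production version would more likely use pin projection than boxing to avoid the extra allocation.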

rdettai · 2025-04-10T15:17:39Z

quickwit/quickwit-search/src/leaf.rs

-    let label_values = match leaf_search_response_reresult {
-        Ok(Ok(_)) => ["success"],
-        _ => ["error"],
-    };
-    SEARCH_METRICS
-        .leaf_search_targeted_splits
-        .with_label_values(label_values)
-        .observe(num_splits as f64);
-


Was there any specific reason why this was measured per index here?

rdettai · 2025-04-14T12:12:17Z

> Writing a custom future would be much cleaner

Good point, I updated the PR to use a future.

For the root search the state machine is a bit more complicated because the request can also fail/cancel during planning (fetch of the split metadata). I did some refactoring to have a clearer split between the plan and exec phases, and applied a tracking future to both. All in all it's probably not much simpler than my original proposal, but I think it's a bit more readable.

I also renamed a few functions to make clearer the granularity at which we are working (multi-index vs. single doc mapping).
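To make the plan/exec split concrete, here is a rough sketch of how a tracker might turn the current step and the observed outcome into a status label. Only the two `Plan` arms with `Some(false)` and `None` appear in the diff under review; the `Exec` variant's payload, the other arms, the reading of the tuple as (targeted splits, status), and the function name `status_for` are assumptions made for illustration.

```rust
/// Phase the root search request was in when the tracker was finalized.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RootSearchMetricsStep {
    Plan,
    // Assumed payload: the number of splits targeted once planning succeeded.
    Exec { num_targeted_splits: usize },
}

/// Maps (step, outcome) to (targeted splits, status label).
/// `is_success` is `Some(true)` on success, `Some(false)` on error,
/// and `None` when the future was dropped before completing.
pub fn status_for(
    step: RootSearchMetricsStep,
    is_success: Option<bool>,
) -> Option<(usize, &'static str)> {
    let labelled = match (step, is_success) {
        // Assumption: a successful plan hands over to the exec phase, so nothing is recorded yet.
        (RootSearchMetricsStep::Plan, Some(true)) => return None,
        // These two arms are the ones shown in the diff.
        (RootSearchMetricsStep::Plan, Some(false)) => (0, "plan-error"),
        (RootSearchMetricsStep::Plan, None) => (0, "plan-cancelled"),
        // Assumed labels for the exec phase.
        (RootSearchMetricsStep::Exec { num_targeted_splits }, Some(true)) => {
            (num_targeted_splits, "success")
        }
        (RootSearchMetricsStep::Exec { num_targeted_splits }, Some(false)) => {
            (num_targeted_splits, "error")
        }
        (RootSearchMetricsStep::Exec { num_targeted_splits }, None) => {
            (num_targeted_splits, "cancelled")
        }
    };
    Some(labelled)
}
```

Each distinct label value produces its own time series per metric, which is the cardinality cost this mapping has to be weighed against.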

rdettai · 2025-04-14T12:23:50Z

quickwit/quickwit-search/src/root.rs

+            (RootSearchMetricsStep::Plan, Some(false)) => (0, "plan-error"),
+            (RootSearchMetricsStep::Plan, None) => (0, "plan-cancelled"),


These extra statuses seem valuable, but we should also try to avoid creating too many series. No strong opinion on whether we should have these or not.

(On that topic, I think root_search_requests_total is actually redundant because root_search_request_duration_seconds is a histogram and that creates a _count series)
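The redundancy can be illustrated with a toy histogram, sketched below with the standard library only. This is not the Prometheus client; `Histogram`, its methods, the bucket bounds, and `histogram_demo` are made up for the example, and labels are not modeled. It shows that every observation also increments a count, which is exposed as a `_count` series next to `_bucket` and `_sum`.

```rust
/// Toy histogram in the style of a Prometheus histogram (labels not modeled).
pub struct Histogram {
    upper_bounds: Vec<f64>,
    // Cumulative counts: bucket i counts observations <= upper_bounds[i].
    cumulative_counts: Vec<u64>,
    sum: f64,
    count: u64,
}

impl Histogram {
    pub fn new(upper_bounds: Vec<f64>) -> Self {
        let cumulative_counts = vec![0; upper_bounds.len()];
        Self { upper_bounds, cumulative_counts, sum: 0.0, count: 0 }
    }

    pub fn observe(&mut self, value: f64) {
        for (bound, bucket) in self.upper_bounds.iter().zip(self.cumulative_counts.iter_mut()) {
            if value <= *bound {
                *bucket += 1;
            }
        }
        self.sum += value;
        // This is the value a separate `*_requests_total` counter would duplicate.
        self.count += 1;
    }

    /// Renders the series in a text format resembling the Prometheus exposition format.
    pub fn render(&self, name: &str) -> Vec<String> {
        let mut lines = Vec::new();
        for (bound, bucket) in self.upper_bounds.iter().zip(&self.cumulative_counts) {
            lines.push(format!("{name}_bucket{{le=\"{bound}\"}} {bucket}"));
        }
        lines.push(format!("{name}_bucket{{le=\"+Inf\"}} {}", self.count));
        lines.push(format!("{name}_sum {}", self.sum));
        lines.push(format!("{name}_count {}", self.count));
        lines
    }
}

/// Observes three request durations (in seconds) and renders the resulting series.
pub fn histogram_demo() -> Vec<String> {
    let mut histogram = Histogram::new(vec![0.1, 1.0, 10.0]);
    for duration_secs in [0.05, 0.4, 30.0] {
        histogram.observe(duration_secs);
    }
    histogram.render("root_search_request_duration_seconds")
}
```

The `_count` series only replaces the counter if the histogram carries the same labels, so whether the counter can actually be dropped depends on how both metrics are labeled.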


rdettai force-pushed the cancelation-search-metrics branch 2 times, most recently from 082adcb to 81b6b7f Compare April 14, 2025 12:03


rdettai requested a review from guilload April 14, 2025 12:24

rdettai force-pushed the cancelation-search-metrics branch from 81b6b7f to 86b6611 Compare April 17, 2025 13:44

guilload approved these changes May 13, 2025


rdettai added 6 commits May 14, 2025 11:53

Record search metrics on cancelation

bc48a9b

Refactor to clarify leaf search levels

b627d12

Improve root search cancellation states

ab5b9fc

Replace leaf search cancelation tracking task with future

1e4746d

Replace root search cancelation traking task with future

e0fa683

Refactor metrics trackers to their own file

6aebdbe

rdettai force-pushed the cancelation-search-metrics branch from ca20f2e to 6aebdbe Compare May 14, 2025 09:55

rdettai merged commit 2e36400 into main May 14, 2025
8 checks passed

rdettai deleted the cancelation-search-metrics branch May 14, 2025 10:09
